IHS-193 Add support for file upload/download for CoreFileObject #792

gmazoyer merged 21 commits into infrahub-develop

Conversation
Walkthrough

Adds end-to-end file upload/download support to the SDK. New GraphQL multipart helpers (MultipartBuilder, multipart POST/request builders) and extended GraphQL types support file uploads. Introduces PreparedFile and FileHandler/FileHandlerSync for streaming uploads and downloads. InfrahubClient/InfrahubClientSync gain multipart upload execution, proxy-building, and streaming GET helpers. Node classes gain file state and APIs (is_file_object, upload_from_path, upload_from_bytes, clear_file, download_file) and validation. Protocol types include CoreFileObject. Extensive unit tests and pylint exceptions were added.

Pre-merge checks: 3 passed.
Deploying infrahub-sdk-python with Cloudflare Pages

| Latest commit: | 66c9871 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://6650b7f0.infrahub-sdk-python.pages.dev |
| Branch Preview URL: | https://gma-20260202-ihs193.infrahub-sdk-python.pages.dev |
Codecov Report

❌ Patch coverage is

```
@@              Coverage Diff                     @@
##        infrahub-develop     #792       +/-   ##
====================================================
+ Coverage           80.36%   80.46%     +0.09%
====================================================
  Files                 115      117         +2
  Lines                9865    10222       +357
  Branches             1504     1546        +42
====================================================
+ Hits                 7928     8225       +297
- Misses               1415     1462        +47
- Partials              522      535        +13
====================================================
```

Flags with carried forward coverage won't be shown.

... and 1 file with indirect coverage changes.
Use two distinct methods instead of one to clearly show intent:

* `select_content_for_upload`
* `select_file_for_upload`
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@infrahub_sdk/file_handler.py`:
- Around line 98-118: The handler handle_error_response should guard against
non-JSON responses by wrapping each exc.response.json() call in a try/except
(catching json.JSONDecodeError or ValueError) and falling back to
exc.response.text; for 401/403, if json parsing fails treat the body text as a
single message (or empty) when building the AuthenticationError message, and for
404 use the fallback text as the detail when constructing NodeNotFoundError;
preserve the original HTTPStatusError as the cause (raise ... from exc).
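The guard suggested above can be demonstrated with a small, self-contained sketch. The function name and return shape are illustrative, not the SDK's actual code; the point is the try/except around JSON parsing with a fall-back to the raw body text:

```python
import json


def safe_error_messages(body: str) -> list[str]:
    """Extract error messages from a response body, tolerating non-JSON payloads."""
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        # Non-JSON body (e.g. an HTML error page from a proxy): use the raw text.
        return [body] if body else []
    errors = payload.get("errors", []) if isinstance(payload, dict) else []
    return [e.get("message", "") for e in errors] or ([body] if body else [])
```

The same pattern applies whether the caller then builds an AuthenticationError (401/403) or a NodeNotFoundError (404); only the message construction differs.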
In `@infrahub_sdk/node/node.py`:
- Around line 1160-1163: The error raised when self._file_object_support is true
and self._file_content is None references the wrong helper method name; update
the ValueError text in the block that checks self._file_object_support /
self._file_content (near usage of self._schema.kind) to instruct callers to use
the correct methods select_file_for_upload() or select_content_for_upload()
instead of set_file().
- Around line 2049-2052: The error message inside the check for
self._file_object_support and self._file_content should reference the actual
synchronous setter method name instead of the non-existent set_file(); update
the message to instruct callers to use the real sync setter (e.g.,
set_file_content()) — modify the ValueError string that currently mentions
set_file() so it references set_file_content() (or the actual sync method used
in this class) and keep the rest of the message intact; target the block that
checks _file_object_support and _file_content and update the message formatting
around _schema.kind accordingly.
🧹 Nitpick comments (1)
infrahub_sdk/client.py (1)
997-1035: Avoid mutating caller-provided `variables` in file-upload helpers.

Both async and sync paths mutate the incoming dict. Prefer a new dict to avoid side effects.

Suggested refactor:

```diff
-    variables = variables or {}
-    variables["file"] = None
+    variables = {**(variables or {}), "file": None}
@@
-    variables = variables or {}
-    variables["file"] = None
+    variables = {**(variables or {}), "file": None}
```

Also applies to: 2055-2093
I would prefer the
Force-pushed: e9e0f9e to 1b304a6 (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@infrahub_sdk/protocols.py`:
- Around line 110-115: The CoreFileObject class (and the other affected
generated blocks referenced at lines 234-237, 675-680, 799-802) were edited
manually but this file is generated; revert any hand edits to these definitions
and re-run the generation task or update the source schema so the generator
emits the correct definitions (e.g., restore CoreFileObject and the other
generated classes/fields to match the generator output), do not make direct
changes in infrahub_sdk/protocols.py, and commit the regenerated file instead.
🧹 Nitpick comments (1)
infrahub_sdk/node/node.py (1)
224-288: Reuse `_validate_file_object_support` to avoid duplicated checks.

`upload_from_path` and `upload_from_bytes` repeat the same FeatureNotSupportedError logic; reusing the helper keeps error messaging consistent and centralized.

♻️ Suggested refactor:

```diff
-    if not self._file_object_support:
-        raise FeatureNotSupportedError(
-            f"File upload is not supported for {self._schema.kind}. Only nodes inheriting from CoreFileObject support file uploads."
-        )
+    self._validate_file_object_support(
+        message=(
+            f"File upload is not supported for {self._schema.kind}. "
+            "Only nodes inheriting from CoreFileObject support file uploads."
+        )
+    )
@@
-    if not self._file_object_support:
-        raise FeatureNotSupportedError(
-            f"File upload is not supported for {self._schema.kind}. Only nodes inheriting from CoreFileObject support file uploads."
-        )
+    self._validate_file_object_support(
+        message=(
+            f"File upload is not supported for {self._schema.kind}. "
+            "Only nodes inheriting from CoreFileObject support file uploads."
+        )
+    )
```

Also applies to: 509-512
It was agreed to use `upload_from_path(path)` and `upload_from_bytes(content, name)`. Also updated the protocols that raised an error; it seems like a valid error.
Force-pushed: 1b304a6 to 72047a7 (Compare)
ogenstad left a comment:

It would be good to have some integration tests once we have things updated on the Infrahub side.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@infrahub_sdk/client.py`:
- Around line 997-1033: Summary: The method _execute_graphql_with_file mutates
the caller's variables dict by assigning variables["file"] = None; make a copy
first to avoid leaking state. Fix: inside _execute_graphql_with_file (and the
other similar block) replace the in-place modification with a shallow copy
(e.g., vars_copy = dict(variables) or variables.copy() after handling None) and
set vars_copy["file"] = None, then use vars_copy for the request; ensure you
handle variables being None before copying. Reference symbols:
_execute_graphql_with_file, variables, "file".
- Around line 1060-1090: The multipart upload retries can send an empty file
because the passed file-like object has been consumed; in the async
_post_multipart method (decorated with handle_relogin) seek the incoming
file_content to the start (file_content.seek(0)) before calling
MultipartBuilder.build_payload so the payload is built from the beginning of the
stream, and apply the identical change to the sync counterpart (the sync
_post_multipart or similarly named method used elsewhere) and any other
occurrences (the other multipart methods around the second block referenced) so
retries re-upload the full file.
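The rewind-before-read behavior can be shown with plain `io` objects. This is a minimal sketch of the idea, not the SDK's `_post_multipart` itself:

```python
import io


def build_payload_bytes(file_content: io.BufferedIOBase) -> bytes:
    """Rewind before reading so a relogin-retry re-sends the whole file."""
    if file_content.seekable():
        file_content.seek(0)
    return file_content.read()
```

Without the `seek(0)`, a second call after a failed attempt would read from the end of an already-consumed stream and upload an empty body.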
In `@infrahub_sdk/ctl/branch.py`:
- Around line 122-123: The two occurrences of "pc.created_by" (used in
proposed_change_table.add_row and format_timestamp) currently use a blanket "#
type: ignore[attr-defined]" without justification; fix this by either updating
the type definitions for the object that pc is (so created_by is declared on
that model) or by replacing the ignore with an explicit, checkable
assertion/cast (e.g., assert isinstance(pc, ExpectedProposedChange) or cast to
the model type) and add a short inline comment explaining why created_by is
guaranteed to exist (e.g., "created_by is populated by the API vX response for
created proposed changes"); update the annotation for pc or add the assert/cast
next to the accesses of pc.created_by so mypy no longer needs the ignore and
future readers see the rationale.
In `@infrahub_sdk/node/node.py`:
- Around line 249-275: The docstring example for upload_from_bytes is unsafe
because it stores a file handle in self._file_content and then closes it (via
the with block) before node.save() runs, causing I/O errors; update the example
and text in upload_from_bytes to show one of two correct patterns: (a) read the
file into bytes and call upload_from_bytes(content=f.read(), name=...) outside a
with, or (b) keep the handle open and call node.save() inside the with block
(e.g., with open(..., "rb") as f: node.upload_from_bytes(content=f, name="...");
node.save()). Mention that this applies to nodes inheriting from CoreFileObject
and reference upload_from_bytes, self._file_content and save() in the docstring.
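The hazard described above can be reproduced with plain `io` objects, no SDK involved. Storing a handle and reading it after the `with` block exits (as a deferred `save()` would) fails:

```python
import io

# Simulate the unsafe docstring pattern: a handle stored now, read later.
stored = io.BytesIO(b"payload")
stored.close()  # the `with open(...)` block has exited
try:
    stored.read()  # the deferred read that save() would attempt
    deferred_read_ok = True
except ValueError:  # "I/O operation on closed file"
    deferred_read_ok = False

# Safe pattern (a): materialize the bytes while the handle is still open.
with io.BytesIO(b"payload") as f:
    content = f.read()  # the bytes outlive the handle
```

Pattern (b) is the alternative: keep `save()` inside the `with` block so the stream is still open when the upload happens.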
In `@tests/unit/sdk/test_file_object.py`:
- Around line 78-105: The test uses type: ignore[union-attr] on attributes like
node.contract_start and node.contract_end in
test_node_create_with_file_uses_multipart (and the other listed tests) without
justification; replace each bare ignore by either asserting/casting the node to
the correct runtime type or adding a brief justification comment: e.g., use an
explicit runtime assertion (assert isinstance(node, InfrahubNode) or cast via
typing.cast) before accessing .contract_start/.contract_end, or add a one-line
comment explaining why the union attribute is present and safe (e.g., "# safe:
test constructs node with schema that guarantees contract_start attribute");
update the occurrences around InfrahubNode/InfrahubNodeSync instantiation so
mypy no longer needs type: ignore or the ignore is accompanied by the short
justification comment.
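Both options the comment proposes can be sketched as follows. `FileNode` here is a stand-in class for illustration only, not an SDK type:

```python
from typing import cast


class FileNode:  # stand-in for the SDK node class, for illustration only
    contract_start = "2026-01-01"


def make_node() -> "FileNode | None":
    return FileNode()


node = make_node()
# Option 1: a runtime assertion narrows the Optional for mypy and documents intent.
assert isinstance(node, FileNode)
start = node.contract_start

# Option 2: cast when test construction guarantees the type (no runtime check).
node2 = cast(FileNode, make_node())
```

Either approach lets mypy narrow the union so the bare `type: ignore[union-attr]` can be dropped.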
🧹 Nitpick comments (3)
infrahub_sdk/node/node.py (3)
282-288: Consider moving async/sync file upload preparation to their respective classes.
`_get_file_for_upload` is an `async def` on `InfrahubNodeBase`, which means `InfrahubNodeSync` inherits an async method it never uses. Similarly, `_get_file_for_upload_sync` is inherited by `InfrahubNode`. This pattern follows from the existing codebase, but the guideline says to never mix async/sync inappropriately. Consider moving `_get_file_for_upload` to `InfrahubNode` and `_get_file_for_upload_sync` to `InfrahubNodeSync` to keep the separation clean, similar to how `_process_hierarchical_fields` is defined separately on each class.
791-824: Consider adding `@overload` signatures for better type narrowing.

`FileHandler.download` already uses `@overload` to distinguish `bytes` (no dest) vs `int` (with dest). Without matching overloads here, callers always receive `bytes | int` and must manually narrow. This is a minor DX concern.

Example overloads:

```python
from typing import overload

@overload
async def download_file(self, dest: None = None) -> bytes: ...

@overload
async def download_file(self, dest: Path) -> int: ...

async def download_file(self, dest: Path | None = None) -> bytes | int: ...
```
1186-1210: Extract duplicated file-upload mutation logic into a helper.

The file-upload-or-regular-mutation block is repeated verbatim in `create` and `update` (both async and sync — 4 copies total). Consider extracting a helper like `_execute_mutation(query, input_data, tracker, timeout)` that encapsulates the "file" branch, try/finally cleanup, clear_file, and the else-branch. This would reduce the surface area for bugs if the logic needs to change later.

Also applies to: 1228-1252
This change adds support to create, update and retrieve nodes whose schemas implement CoreFileObject.

The proposed user-exposed API follows these key points:

* `node.upload_from_path(path)` to select a file from disk for upload (streams during upload)
* `node.upload_from_bytes(content, name)` to set content for upload (supports bytes or BinaryIO)
* `node.download_file(dest)` to download a file to memory or stream it to disk

Being able to stream a file to or from disk is important in order to support large files and to avoid loading them completely into memory (which would be problematic for +1GB files in general).

The choice of `upload_from_path` and `upload_from_bytes` is to prevent a collision with a potential attribute or relationship called `file` in the schema. That is also the reason why the `file` GraphQL parameter sits outside the `data` one instead of inside it.

Here we introduce a couple of components to try to make our code SOLID (it's not much for now, but it's honest work):

* `FileHandler`/`FileHandlerSync`: dedicated classes for file I/O operations
* `MultipartBuilder`: GraphQL Multipart Request Spec payload building

It sure won't make our code SOLID, but it won't add to the burden for now.
So, for a user who has loaded a schema, using our SDK will look like:
Upload a file when creating a node
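The original code snippet did not survive extraction; the following is a hypothetical sketch based on the API described in this PR. The method names (`upload_from_path`, `save`) come from the PR description, while the client construction and `create()` signature are assumptions:

```python
# Hypothetical sketch: client setup and create() signature are assumed.
async def create_with_upload(client, path: str):
    node = await client.create(kind="CoreFileObject", name="report")
    node.upload_from_path(path=path)  # file is streamed during save()
    await node.save()
    return node
```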
Download a file from a node